Assignment 11: Software Concurrent Threaded Pipelines

Author

  • John Giacomoni
Abstract

Traditionally, increases in transistor counts and advances in fabrication technology have led to increased performance. However, these techniques are showing diminishing returns due to limitations arising from power consumption, design complexity, and wire delays. In response, designers have turned to chip multiprocessors (CMPs) that incorporate multiple cores on a single die. The performance, cost, and flexibility of these CMP systems make them appealing for threaded applications. Unfortunately, popular threading techniques require independent code regions, use expensive synchronization primitives, and use expensive communication mechanisms. Concurrent Threaded Pipeline (CTP) architectures relax the data independence requirement and can increase computational throughput in proportion to the pipeline depth. Examples include Decoupled Software Pipelining, which focuses on compiler-based extraction of pipelines from sequential codes, and the Frame Shared Memory architecture, which focuses specifically on network processing. CTP architectures show great promise for threading applications given a low-overhead high-speed blocking queue implementation. This dissertation presents a portable general-purpose software framework for realizing pipeline throughput benefits in applications amenable to being pipelined. The general applicability of the technique is confirmed along with the two software components necessary for implementation. First, a novel software-only low-overhead high-speed blocking queue implementation suitable for CTPs is presented. Second, a scheduling system for gracefully handling over-subscribed situations is described.
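The pipeline structure the abstract describes can be illustrated with a minimal sketch: each stage runs in its own thread and hands items to the next stage through a bounded blocking queue. This is not the dissertation's framework or its queue; it assumes Python's standard lock-based `queue.Queue` as a stand-in, and the names `stage`, `run_pipeline`, `SENTINEL`, and `capacity` are hypothetical. It shows only the shape of a threaded pipeline, not the low-overhead behavior the abstract claims.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker


def stage(func, inbox, outbox):
    """Run one pipeline stage: apply func to each item and pass it on."""
    while True:
        item = inbox.get()            # blocks until upstream supplies an item
        if item is SENTINEL:
            outbox.put(SENTINEL)      # propagate shutdown to the next stage
            return
        outbox.put(func(item))        # blocks while the downstream queue is full


def run_pipeline(funcs, inputs, capacity=8):
    """Send inputs through funcs, one thread per stage, preserving order."""
    # One bounded queue before each stage, plus one for the final output.
    queues = [queue.Queue(maxsize=capacity) for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]

    def feed():
        # Feed from a separate thread so the caller can drain the output
        # concurrently; otherwise bounded queues could deadlock.
        for x in inputs:
            queues[0].put(x)
        queues[0].put(SENTINEL)

    feeder = threading.Thread(target=feed)
    for t in workers + [feeder]:
        t.start()

    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)

    for t in workers + [feeder]:
        t.join()
    return results


if __name__ == "__main__":
    # Two dependent stages: stage 2 consumes what stage 1 produces.
    print(run_pipeline([lambda x: x + 1, lambda x: x * 2], range(5)))
```

Because each stage is a single thread reading from a FIFO queue, items leave the pipeline in the order they entered. The per-item cost of the queue operations is the overhead that a queue designed for CTPs would need to minimize.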


Similar resources

FastForward for Concurrent Threaded Pipelines

The performance, cost, and flexibility of commodity multi-core systems make them appealing for threaded applications. Unfortunately, popular threading techniques require independent code regions, use expensive synchronization primitives, and use expensive communication mechanisms. Recently, researchers have proposed several Concurrent Threaded Pipeline architectures (CTP) which relax the data i...


FastForward for Concurrent Threaded Pipelines ; CU-CS-1023-07

The performance, cost, and flexibility of commodity multi-core systems make them appealing for threaded applications. Unfortunately, popular threading techniques require independent code regions, use expensive synchronization primitives, and use expensive communication mechanisms. Recently, researchers have proposed several Concurrent Threaded Pipeline architectures (CTP) which relax the data i...


Harnessing Chip-Multiprocessors with Concurrent Threaded Pipelines ; CU-CS-1024-07

Single-core performance increases have stalled. To increase available cycles, microprocessor designers have shifted to chip-multiprocessor (CMP) designs. Unfortunately, the additional processors provided by CMPs may remain idle because most applications lack data-parallelism and task-parallelism is unlikely to saturate future CMP designs. The systems community needs to rethink how systems are st...


Harnessing Chip-Multiprocessors with Concurrent Threaded Pipelines

Single-core performance increases have stalled. To increase available cycles, microprocessor designers have shifted to chip-multiprocessor (CMP) designs. Unfortunately, the additional processors provided by CMPs may remain idle because most applications lack data-parallelism and task-parallelism is unlikely to saturate future CMP designs. The systems community needs to rethink how systems are st...


Building a Domain-Knowledge Guided System Software Environment to Achieve High-Performance of Multi-core Processors

Although multi-core processors have become dominant computing units in basic system platforms from laptops to supercomputers, software development for effectively running various multi-threaded applications on multi-cores has not made much progress, and effective solutions are still limited to high performance applications relying on existing parallel computing technology. In practice, majority ...




Publication date: 2006